A brief roadmap of the steps leading up to PCA is as follows:
Today's course introduced a new term: covariance.
What is covariance? Why do we need to learn it?
Beyond the mean and variance of a single dataset, it is also interesting to understand the relationship between two datasets.
Two datasets with the same mean and variance may be positively related, negatively related, or have nothing to do with each other. The sign of the covariance indicates the direction of the (linear) relationship.
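To make this concrete, here is a minimal sketch. The arrays `same_order`, `reversed_order`, and `shuffled`, and the helper `cov_with_base`, are hypothetical names and values chosen for illustration; they are not from the course. All three arrays contain the numbers 1 to 7, so they share the same mean and variance, yet their covariance with `base` is positive, negative, and zero respectively.

```python
import numpy as np

# Hypothetical example data: every array is a permutation of 1..7,
# so all of them have the same mean (4.0) and the same variance.
base = np.arange(1, 8)
same_order = base.copy()                      # moves together with base
reversed_order = base[::-1]                   # moves opposite to base
shuffled = np.array([6, 2, 4, 5, 3, 1, 7])    # picked so the covariance with base is zero


def cov_with_base(other):
    """Return cov(base, other), the off-diagonal entry of the 2x2 matrix from np.cov."""
    return np.cov(base, other)[0, 1]


cov_pos = cov_with_base(same_order)
cov_neg = cov_with_base(reversed_order)
cov_zero = cov_with_base(shuffled)

print("positive:", cov_pos)
print("negative:", cov_neg)
print("zero:", cov_zero)
```

Note that a covariance of zero only rules out a linear relationship; it does not by itself prove that two datasets are independent.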
The code below prints the variance, mean, and covariance of two datasets:
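import numpy as np

# Two datasets with the same values in opposite order
array_a = np.arange(1, 8, 1)   # 1, 2, ..., 7
array_b = np.arange(7, 0, -1)  # 7, 6, ..., 1

# np.var uses ddof=0 by default, i.e. it divides by n
var_a, var_b = np.var(array_a), np.var(array_b)
print("Variance for dataset a and b")
print(var_a, var_b)

mean_a, mean_b = np.average(array_a), np.average(array_b)
print("Mean value for dataset a and b")
print(mean_a, mean_b)

# np.cov returns the 2x2 covariance matrix; the off-diagonal entries are cov(a, b).
# It divides by n - 1 by default, so its diagonal differs from np.var above.
cov_a_b = np.cov(array_a, array_b)
print("Covariance matrix for dataset a and b")
print(cov_a_b)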
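To see what `np.cov` computes, the sketch below works out the covariance by hand from its definition: the sum of the products of each pair's deviations from their means, divided by n - 1. The function name `sample_covariance` is a hypothetical helper introduced here for illustration, and the n - 1 divisor is chosen to match the default behaviour of `np.cov`.

```python
import numpy as np


def sample_covariance(x, y):
    """Sample covariance of two 1-D datasets of equal length (divides by n - 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1-D, the same length, and have at least 2 values")
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return np.sum(dx * dy) / (x.size - 1)


array_a = np.arange(1, 8, 1)
array_b = np.arange(7, 0, -1)

manual = sample_covariance(array_a, array_b)
library = np.cov(array_a, array_b)[0, 1]  # off-diagonal entry is cov(a, b)

print("manual:", manual)
print("np.cov:", library)
```

The two values should agree, and the result is negative because `array_b` decreases as `array_a` increases.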